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Abstract We present in this paper new multiscale transforms on the sphere, namely 
the isotropic undecimated wavelet transform, the pyramidal wavelet transform, 
the ridgelet transform and the curvelet transform. All of these transforms can be 
inverted i.e. we can exactly reconstruct the original data from its coefficients in 
either representation. Several applications are described. We show how these trans- 
forms can be used in denoising and especially in a Combined Filtering Method, 
which uses both the wavelet and the curvelet transforms, thus benefiting from the 
advantages of both transforms. An application to component separation from mul- 
tichannel data mapped to the sphere is also described in which we take advantage 
of moving to a wavelet representation. 

Key words. Cosmology : cosmic microwave background, early universe, Methods: Data 
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1. Introduction 



Wavelets in Astronomy 



2002) which have 



Wavelets are very popular tools in astronomy (St arck and M urtagh. 
led to very impressive results in denoising and detection applications. For instance, 
both the Chandra and the XMM data centers use wavelets for the detection of ex- 
tended sources in X-ray images. For denoising and deconvolution, wavel ets have also 



demonstrated how powerful they are for discriminating signal from noise IjStarck et al 



2002b). In cosmology wavelets h ave been used in many studies such as for analyzing 



the spatial d i stribution of galaxies ( Slezak et al 



Starck et al 



IIRocha et al 



1999 



2005a; 
I2OO4L 



Martinez et alJ 



1993: 



Escalera and MacGillivrav 



1995 



Barrciro ct al 



2005), determining the topo logy of the universe 
detect i ng non-Gaus s ianity in the CMB maps llAghanim and Forni , 



2001 



Vielva et al 



primordia l power spectrum Mukh eriee and Wans , 



2004: 



Starck et al 




2004J), reconstructing the 



measuring the galaxy power 




or reconstructing weak lensing mass maps 
2005bJ). It has also been shown that noise is a problem of major concern for N-body sim- 
ulations of structure formation in the early Universe and that using wavelets to remove 
noise from N-b ody simulations is equivale nt to simulations with two orders of magnitude 



more particles l|Romeo et al 



2003, 



2004). 



The most popular wavelet algorithm in astrophysical applications is the so-called "a 
trous algorithm" . It is an isotropic undecimated wavelet transform in which the scaling 
function used is a Box-Spline of order three. The isotropy of the wavelet function makes 
this decomposition optimal for the detection of isotropic objects. The non decimation 
makes the decomposition redundant (the number of coefficients in the decomposition is 
equal to the number of samples in the data multiplied by the number of scales) and 
allows us to avoid Gibbs aliasing after reconstruction in image restoration applications, 
as generally occurs with orthogonal or bi-orthogonal wavelet transforms. The choice of 
a _B3-spline is motivated by the fact that we want an analyzing function close to a 
Gaussian, but verifying the dilation equation, which is required in order to have a fast 
transformation. One last property of this algorithm is to provide a very straightforward 
reconstruction. Indeed, the sum of all the wavelet scales and of the coarsest resolution 
image reproduces exactly the original image. 



When analyzing data that contains anisotropic features, wavelets are no longer opti- 
mal and this has motivated the devel opment of new multiscale decompositions such as the 



ridgelet and the curvelet tr ansforms ( Donoh o and Duncan 



Starck et al. l|Starck et al 



2000; 



Starck et al 



2002a). In 



20041) , it has been shown that the curvelet transform could be 
useful for the detection and the discrimination of non Gaussianity in CMB data. In this 
area, further insight will come from the analysis of full-sky data mapped to the sphere 
thus requiring the development of a curvelet transform on the sphere. 



Wavelets on the sphere 



Several wa velet transforms on the sphere have been proposed in recent years. Schroder 



and Sweldens ( Sc hroder and Swelden; 



1995) have developed an orthogonal wavelet trans- 



form on the sphere based on the Haar wavelet function. Its interest is however rela- 
tively limited because of the poor properties of the Haar function and the problems 
inherent to the orthogonal decomposition. Many papers descr i be new continuous wavelet 



transforms IjAntoine > 



1999: 



Tenorio et al 



1999 



Cavon et al 



2001 



Holschncidcr 



1996) 



and the recent detection of non-Gaussianity in CMB was obtained by Vielva et al. 



Vielva et al 



2004) using the continuous Mexican Hat wavelet transform ( Cay^n^et^l 



2001). These works have been extended to directional wavelet transforms ( Antoine et al 



2002: McEwen et al 



2004: 



Wiaux et al 



2005). All these new continuous wavelet decom- 



positions are interesting for data analysis, but cannot be used for restoration purposes 
because of the lack of an inverse transform. Only the algorithm propos ed by Freeden 



and Maier IjFreeden and Windheuser 



1997 



Freeden and Schneider 



1998), based on the 



Spherical Harmonic Decomposition, has an inverse transform. 

The goal of this paper is to extend existing 2D multiscale decompositions, namely 
the ridgelet transform, the curvelet transform and the isotropic wavelet transform, which 
work on flat images to multiscale methods on the sphere. In sectional we present a new 
isotropic wavelet transform on the sphere which has similar properties to the d trous 
algorithm and therefore should be very useful for data denoising and deconvolution. 
This algorithm is d irectly derived from the FFT-based wavelet transform proposed in 
aperture synthesis image re storation, and is relatively close to 



Starck et al 



1 1994(1 for 



the Freeden and Maier (Freeden and Schneider 



1998) method, except that it features the 
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same straightforward reconstruction as the a trous algorithm (i.e. the sum of the scales 
reproduces the original data). This new wavelet transform also can be easily extended 
to a pyramidal wavelet transform, hence with reduced redundancy, a possibility which 
may be very important for larger data sets such as expected from the future Planck- 
Surveyor experiment. In section we show how this new decomposition can be used to 
derive a curvelet transform on the sphere. Section ^describes how these new transforms 
can be used in denoising applications and introduces the Combined Filtering Method, 
which allows us to filter data on the sphere using both the Wavelet and the Curvelet 
transform s. In secti on [5] we show that the independent component analysis method 



wSMICA (|Moudden et al 



2005(1 . which was designed for 2D data, can be extended to 



data on the sphere. 



2. Wavelet transform on the sphere 

2.1. Isotropic Undecimated Wavelet Transform on the Sphere (UWTS) 

There are clearly many different possible implementations of a wavelet transform 
on the sphere and their performance depends on the application. We describe here an 
undecimated isotropic transform which has many properties in common with the d trous 
algorithm, and is therefore a good candidate for restoration applications. Its isotropy 
is a favorable property when analyzing a statistically isotropic Gaussian field such as 
the CM B or data sets such as maps of galaxy clusters, which contain only isotropic 



features IjStarck et al 



1998). Our isotropic transform is obtained using a scaling function 
<j>i a ('d,(p) with cut-off frequency l c and azimuthal symmetry, meaning that 4>i c does not 
depend on the azimuth ip. Hence the spherical harmonic coefficients <pi a (/, m) of 0; c vanish 
when m ^ so that : 



z=z c 



(1) 



1=0 



where the Yi t7n are the spherical harmonic basis functions. Then, convolving a map /($, if) 
with <j)i c is greatly simplified and the sph erical harmonic coefficie nts do(l,m) of the re- 
sulting map co(d, ip) are readily given by (|Bogdanova et all 120051: 



co(Z,m) = (f>i c * f(l,m) 



Air 
21 + 1 

where * stands for convolution. 



cf>l c (l,0)f(l,m) 



(2) 



Multiresolution decom post ion 

A sequence of smoother approximations of / on a dyadic resolution scale can be 
obtained using the scaling function (f>i c as follows 

co = 4>l c * / 

Cl = 4>2-U c * f 



*/ 



(3) 



where 4>2-H c is a rescaled version of <j>i c with cut-off frequency 2~H C . The above multi- 
resolution sequence can also be obtained recursively. Define a low pass filter hj for each 
scale j by 

<t> i c (l,m) 

■PL - if I < Ifr and m = 
■M*."0=< V° (4) 
otherwise 



Hj(l, m) 



47T 



21 + 1 3 

It is then easily shown that Cj+i derives from Cj by convolution with /i.,-: Cj+i = Cj * hj. 



The wavelet coefficients 

Given an axisymmetric wavelet function tjji c , we can derive in the same way a high 
pass filter gj on each scale j: 



47T 



Gj(l,m) = y 2f7T[9j(l,™>) = < 



if I < tjItt and m = 



if i > 2jfr an d m = 



(5) 



1 

otherwise 

Using these, the wavelet coefficients Wj+i at scale j + 1 are obtained from the previous 
resolution by a simple convolution: Wj + \ = Cj * gj. 



Just as with the a trous algorithm, the wavelet coefficients can be defined as the 
difference between two consecutive resolutions, Wj+\{&, (p) = Cj(d, ip) — Cj+i(#, <p), which 
in fact corresponds to making the following specific choice for the wavelet function tpi c : 



ipu{l,m) — 4> i c (7, m) — (j>i^(l,m) 

2J 23 



(6) 
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The high pass filters gj defined above are, in this particular case, e: 



xpressed as: 



Giihm) 




21 + 1 



gj(l,m) = 1 




21 + 1 



hj(l, to) = 1 — Hj(l, to) 



(7) 



Obviously, other wavelet functions could be used as well. 

Choice of the scaling function 

Any function with a cut-off frequency is a possible candidate. We retained here a 
B-spline function of order 3. It is very close to a Gaussian function and converges rapidly 



Figure 1. On the left, the scaling function <f> and, on the right, the wavelet function ip. 

In Fig. the chosen scaling function derived from a £?-spline of degree 3, and its 
resulting wavelet function, are plotted in frequency space. 



1. Compute the i?3-spline scaling function and derive tp, h and g numerically. 

2. Compute the corresponding Spherical Harmonics of image cq. We get Co. 

3. Set j to 0. Iterate: 

4. Multiply Cj by Hj. We get the array Cj+i. 

Its inverse Spherical Harmonics Transform gives the image at scale j + 1. 

5. Multiply Cj by Gj. We get the complex array Wj+i. 

The inverse Spherical Harmonics transform of Wj+i gives the wavelet coefficients Wj+i at 
scale j + 1. 

6. j=j+l and if j < J, return to Step 4. 

7. The set {wi,W2, . . . , wj, cj} describes the wavelet transform on the sphere of Co. 



to 0: 



<f>i c (l,m 




(8) 



where B{x) = ± (| x - 2 | 3 - 4| x - 1 | 3 + 6| x | 3 - 4| x + 1 | 3 + | x + 2 | 3 ). 



The numerical algorithm for the undecimatcd wavelet transform on the sphere. 
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Reconstruction 

If the wavelet is the difference between two resolutions, Step 5 in the above UWTS 
algorithm can be replaced by the following simple subtraction Wj + i = cj — cj + \. In this 
case, the reconstruction of an image from its wavelet coefficients W = {w\, . . . , wj, cj} 
is straightforward: 



This is the same reconstruction formula as in the a trous algorithm: the simple sum 
of all scales reproduces the original data. However, since the present decomposition is 
redundant, the procedure for reconstructing an image from its coefficients is not unique 
and this can profitably be used to impose additional constraints on the synthesis functions 
(e.g. smoothness, positivity) used in the reconstruction. Here for instance, using the 
relations: 

Cj+i(l, m) = Hj(l,m)cj(l,m) 

Wj + i(l,m) — Gj(l,m)cj(l,m) (10) 
a least squares estimate of Cj from Cj + \ and Wj + \ gives: 

Cj = c j+1 Hj + wj+iGj (11) 
where the conjugate filters Hj and Gj have the expression: 



J 




(9) 




(12) 



(13) 



and the reconstruction algorithm is: 
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1. Compute the i?3-spline scaling function and derive ij), h, g, h, g numerically. 

2. Compute the corresponding Spherical Harmonics of the image at the low resolution cj. We get 

&j. 

3. Set j to J — 1. Iterate: 

4. Compute the Spherical Harmonics transform of the wavelet coefficients Wj+i at scale j + 1. 
We get Wj+i. 

5. Multiply Cj+i by Hj. 

6. Multiply Wj+i by Gj. 

7. Add the results of steps 6 and 7. We get Cj. 

8. j=j-l and if j > 0, return to Step 4. 

9. Compute The inverse Spherical Harmonic transform of co 



The synthesis low pass and high pass filters h and g are plotted in Fig. 

Figure 2. On the left, the filter h, and on the right the filter g. 

Figure[3]shows the WMAP data (top left) and its undecimated wavelet decomposition 
on the sphere using five resolution levels. Figures El middle left, middle right and bot- 
tom left show respectively the four wavelet scales. Figure |31 bottom right shows the last 
smoothed array. Figure 0] shows the backprojection of a wavelet coefficient at different 
scales and positions. 

2.2. Isotropic Pyramidal Wavelet Transform on the Sphere (PWTS) 

In the previous algorithm, no downsampling is performed and each scale of the wavelet 
decomposition has the same number of pixels as the original data set. Therefore, the 



Figure 3. WMAP data and its wavelet transform on the sphere using five resolution lev- 
els (4 wavelet scales and the coarse scale). The sum of these five maps reproduces exactly 
the original data (top left). Top: original data and the first wavelet scale. Middle: the 
second and third wavelet scales. Bottom: the fourth wavelet scale and the last smoothed 
array. 



9 



Figure 4. Backprojection of a wavelet coefficient at different scales. Each map is obtained 
by setting all but one of the wavelet coefficients to zero, and applying an inverse wavelet 
transform. Depending on the scale and the position of the non zero wavelet coefficient, 
the reconstructed image presents an isotropic feature with a given size. 



Figure 5. WMAP data (top left) and its pyramidal wavelet transform on the sphere 
using five resolution levels (4 wavelet scales and the coarse scale). The original map can 
be reconstructed exactly from the pyramidal wavelet coefficients. Top: original data and 
the first wavelet scale. Middle: the second and third wavelet scales. Bottom: the fourth 
wavelet scale and the last smoothed array. The number of pixels is divided by four at 
each resolution level, which can be helpful when the data set is large. 

number of pixels in the decomposition is equal to the number of pixels in the data 
multiplied by the number of scales. For applications such as PLANCK data restoration, 
we may prefer to introduce some decimation in the decomposition so as to reduce the 
required memory size and the computation time. This can be done easily by using a 
specific property of the chosen scaling function. Indeed, since we are considering here a 
scaling function with an initial cut-off l c in spherical harmonic multipolc number I, and 
since the actual cut-off is reduced by a factor of two at each step, the number of significant 



10 Wavelets, Ridgelets and Curvelets on the Sphere 

spherical harmonic coefficients is then reduced by a factor of four after each convolution 
with the low pass filter h. Therefore, we need fewer pixels in the direct space when we 
compute the inverse spherical harmonic transform. Using the Healpix pixelization scheme 
this can be done easily by dividing by 2 the nside parameter when 
resorting to the inverse spherical harmonic transform routine. 

Figure [5] shows WMAP data (top left) and its pyramidal wavelet transform using five 
scales. As the scale number increases (i.e. the resolution decreases), the pixel size becomes 
larger. Figures top right, middle left, middle right and bottom left show respectively 
the four wavelet scales. Figure bottom right shows the last smoothed array. 



3. Curvelet Transform on the Sphere (CTS) 

3.1. Introduction. 



The 2D curvelet transform, proposed in 



(2002a|) and 



Starck et al 



Donoho and Duncan (20O 



Starck et al 



I 2003a) enables the directional analysis of an image in differ- 
ent scales. The fundamental property of the curvelet transform is to analyze the data 
with functions of length of about 2~ J / 2 for the j th sub-band [2-?, 2 J +1 1 of the two dimen- 



sion al wavelet transform. Following the implementation described in 
and IStarck et al 



Starck et al 



(2002a) 



2003a), the data first undergoes an Isotropic Undecimated Wavelet 
Transform (i.e. a trous algorithm). Each scale j is then decomposed into smoothly over- 
lapping blocks of side-length Bj pixels in such a way that the overlap between two 
vertically adjacent blocks is a rectang ular array of size Bj x Bj/2. Finally, the ridgelet 



transform IjCandes and Don oho. 



1999) is applied on each individual block which amounts 



to applying a 1-dimensional wavelet transform to the slices of its Radon transform. More 
detai ls on the implementation o f the digital curvelet transform can be found in Starck 



et al ljStarck et al 



2002a, 



2003a). It has been shown that the curvelet transform could 



be very useful for the detect ion and the discrimination of sources of non-Gaussianity 



in CMB ijStarck et al 



2003b). The curvelet transform is also redundant, with a redun- 
dancy factor of 16 J + 1 whenever J scales are employed. Its complexity scales like that 
of the ridgelet transform that is as 0(n 2 log 2 n). This method is best for the detection of 
anisotropic structures and smooth curves and edges of different lengths. 
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3.2. Curvelets on the Sphere. 

The Curvelet transform on the sphere (CTS) can be similar to the 2D digital curvelet 
transform, but replacing the a trous algorithm by the Isotropic Wavelet Transform on 
the Sphere previously described. The CTS algorithm consists of the following three steps 



— Isotropic Wavelet Transform on the Sphere. 

— Partitioning. Each scale is decomposed into blocks of an appropriate scale (of side- 
length ~ 2~ s ), using the Healpix pixelization. 

— Ridgelet Analysis. Each square is analyzed via the discrete ridgelet transform. 



Partitioning using the Healpix representation. 

The Healpix representation is a curvilinear partition of the sphere into quadrilateral 
pixels of exactly equal area but with varying shape. The base resolution divides the 
sphere into 12 quadrilateral faces of equal area placed on three rings around the poles 
and equator. Each face is subsequently divided into nside 2 pixels following a hierarchical 
quadrilateral tree structure. The geometry of the Healpix sampling grid makes it easy to 
partition a spherical map into blocks of a specified size 2™. We first extract the twelve 
base-resolution faces, and each face is then decomposed into overlapping blocks as in 
the 2D digital curvelet transform. With this scheme however, there is no overlapping 
between blocks belonging to different base-resolution faces. This may result in blocking 
effects for example in denoising experiments via non linear filtering. A simple way around 
this difficulty is to work with various rotations of the data with respect to the sampling 
grid. 



Ridgelet transform 



Once the partitio ning is performed, the standard 2D ridgelet transform described in 



Starck et al 



I 2003a|) is applied in each individual block 



1. Compute the 2D Fourier transform. 

2. Extract lines going through the origin in the frequency plane. 
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Figure 6. Flowgraph of the Ridgelet Transform on the Sphere. 



Figure 7. Backprojection of a ridgelet coefficient at different scales and orientations. 
Each map is obtained by setting all but one of the ridgelet coefficients to zero, and 
applying an inverse ridgelet transform. Depending on the scale and the position of the 
non zero ridgelet coefficient, the reconstructed image presents a feature with a given 
width and a given orientation. 

3. Compute the ID inverse Fourier transform of each line. This achieves the Radon 
transform of the current block. 

4. Compute the ID wavelet transform of the lines of the Radon transform. 

The first three steps correspond to a Radon transform method called the linogram. Other 
implementations of the R adon transform, such as the Slant Stack Radon Transform 



I Donoho and Flesia , 



2002), can be used as well, as long as they offer an exact recon- 
struction. 

Figure shows the flowgraph of the ridgelet transform on the sphere and Figure [7| 
shows the backprojection of a ridgelet coefficient at different scales and orientations. 



3.3. Algorithm 

The curvelet transform algorithm on the sphere is as follows: 

1. Apply the isotropic wavelet transform on the sphere with J scales, 

2. Set the block size B\ = B min , 

3. For j = 1,. .., J do, 

— partition the subband Wj with a block size Bj and apply the digital ridgelet 
transform to each block, 

— if j modulo 2 = 1 then Bj + i — 2Bj, 

— else Bj + i = Bj. 

The sidelength of the localizing windows is doubled at every other dyadic subband, hence 
maintaining the fundamental property of the curvelet transform which says that elements 
of length about 2~ J / 2 serve for the analysis and synthesis of the j-th subband [2 J , 2 J+1 ]. 
We used the default value B m i n = 16 pixels in our implementation. Figure [S] gives an 
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Figure 8. Flowgraph of the Curvelet Transform on the Sphere. 

Figure 9. Backprojection of a curvelet coefficient at different scales and orientations. 
Each map is obtained by setting all but one of the curvelet coefficients to zero, and 
applying an inverse curvelet transform. Depending on the scale and the position of the 
non zero curvelet coefficient, the reconstructed image presents a feature with a given 
width, length and orientation. 

overview of the organization of the algorithm. Figure |5] shows the backprojection of 
curvelet coefficients at different scales and orientations. 

3.4. Pyramidal Curvelet Transform on the Sphere (PCTS) 

The CTS is very redundant, which may be a problem in handling huge data sets 
such as the future PLANCK data. The redundancy can be reduced by substituting, in 
the curvelet transform algorithm, the pyramidal wavelet transform to the undecimated 
wavelet transform. The second step which consists in applying the ridgelet transform on 
the wavelet scale is unchanged. The pyramidal curvelet transform algorithm is: 
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1. Apply the pyramidal wavelet transform on the sphere with J scales, 

2. Set the block size B\ = B m i n , 

3. For j = 1,..., J do, 

— partition the subband Wj with a block size _Bj and apply the digital ridgelet 
transform to each block, 

— if j modulo 2 = 2 then -Bj+i = Bj/2, 

— else Bj + i = By 

In the next section, we show how the pyramidal curvelet transform can be used for image 
filtering. 

4. Filtering 

4.1. Hard thresholding 



filtering or thresholding methods Iptarck and Murtagli , 



in image denoising 


via non 


2002 


Starck et al.. 


2002a) 



thresholding, for instance, consists in setting all insignificant coefficients (i.e. coefficients 
with an absolute value below a given threshold) to zero. In practice, we need to estimate 
the noise standard deviation Oj in each band j and a wavelet (or curvelet) coefficient uij is 
significant if | Wj |> k<jj, where k is a user-defined parameter, typic ally chosen between 3 



and 5. The cr, estimation in band j can be derived from simulations Iptarck and Murtagh , 



20021) . Denoting as D the noisy data and S the thresholding operator, the filtered data 
D are obtained by : 

D = K5(TD) (14) 

where T is the wavelet (resp. curvelet) transform operator and 1Z is the wavelet (resp. 
curvelet) reconstruction operator. 

Figure HUI describes the setting and the results of a simulated denoising experiment : 
upper left, the original simulated map of the synchrotron emission (renormalized between 
and 255); upper right, the same image plus additive Gaussian noise (<r = 5); middle, 
the pyramidal wavelet filtered image and the residual (i.e. noisy data minus filtered 
data); bottom, the pyramidal curvelet transform filtered image and the residual. A 5a j 
detection threshold was used in both cases. On such data, displaying very anisotropic 
features, using curvelets produces better results than the wavelets. 
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Figure 10. Denoising. Upper left and right : simulated synchrotron image and same 
image with an additive Gaussian noise (i.e. simulated data). Middle: pyramidal wavelet 
filtering and residual. Bottom: pyramidal curvelet filtering and residual. On such data, 
displaying very anisotropic features, the residual with curvelet denoising is cleaner than 
with the wavelet denoising. 

Figure 11. Denoising. Combined Filtering Method (pyramidal wavelet and pyramidal 
curvelet) and residual. 

Figure 12. Combined Filtering Method, face 6 in the Healpix representation 
of the image shown in Figure 1111 From top to bottom and left to right, 
respectively the a) original image face, b) the noisy image, c) the combined 
filtered image, d) the combined filtering residual, e) the wavelet filtering 
residual and f) the curvelet filtering residual. 

4.2. The Combined Filtering Method on the Sphere 



Method Error Standard Deviation SNR (dB) 

Noisy map 5. 13.65 

Wavelet 1.30 25.29 

Curvelet 1.01 27.60 

CFM 0.86 28.99 

Table 1. Table of error standard deviations and SNR values after filtering the syn- 
chrotron noisy map (Gaussian white noise - sigma = 5 ) by the wavelet, the curvelet and 
the combined filtering method. Images are available at "http:/ /jstarck. free.fr/mrs. html". 



The residual images for both the wavelet and curvelet transforms shown in Figure ITUI 
show that wavelets do not restore long features with high fidelity while curvelets are 
seriously challenged by isotropic or small features. Each transform has its own area of 
expertis e and this complem entarity is of great potential. The Combined Filtering Method 
(CFM) Iptarck et all 12001^1 allows us to benefit from the advantages of both transforms. 
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This iterative method detects the significant coefficients in both the wavelet domain and 
the curvelet domain and guarantees that the reconstructed map will take into account 
any pattern detected as significant by either of the transforms. A full description of the 
algorithm is given in Appendix 1 . Figure 1111 shows the CFM denoised image and its 
residual. Figure IT51 shows one face (face 6) of the following Healpix images: upper left, 
original image; upper right, noisy image; middle left, restored image after denoising by the 
combined transform; middle right, the residual; bottom left and right, the residual using 
respectively the curvelet and the wavelet denoising method. The results are reported 
in Table ^ The residual is much better when the combined filtering is applied, and 
no feature can be detected any longer by eye in the residual. Neither with the wavelet 
filtering nor with the curvelet filtering could such a clean residual be obtained. 

5. Wavelets on the Sphere and Independent Component Analysis 

5.1. Introduction 

Blind Source Separation (BSS) is a problem that occurs in multi-dimensional data 
processing. The goal is to recover unobserved signals, images or sources S from mixtures 
X of these sources observed typically at the output of an array of m sensors. A simple 
mixture model would be linear and instantaneous with additive noise as in: 

X = AS + N (15) 

where X, S and N are random vectors of respective sizes m X 1, n x 1 and m X 1, 
and A is an m x n matrix. Multiplying S by A linearly mixes the n sources resulting 
in m observed mixture processes corrupted by additive instrumental Gaussian noise N. 
Independent Component Analysis methods were developed to solve the BSS problem, 
i.e. given a batch of T observed samples of X , estimate A and S, relying mostly on the 
statistical independence of the source processes. 

Algorithms for blind component separation and mixing matrix estimatio n depend 



on th e a priori model used for the probability distri butions of the sources Cardoso. 



2001 



2001 



) although rather coarse assumptions can be made (Cardoso, 



1998; 



Hvvarinen et al 



). In a first set of techniques, source separation is achieved in a noise- less set- 



ting, based on the non-Gaussianity of all but possibly one of the co 



mainstream ICA techniques belong to this category : JADE (Cardoso, 



mpon ents. Most 



1999), FastICA, 
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Infomax IjHvvarinen et all 120011 . In a second set of blind techniques, the components 
are modeled as Gaussian processes and, in a given representation (Fourier, wavelet, etc.), 
separation requires that the sources have diver se, i.e. non proport i onal, variance p rofiles. 



The S pectral Matching ICA method (SMICA) l|Delabrouille et al 



2003; 



Moudden et al 



20051) considers in this sense the case of mixed stationary Gaussian components in a noisy 
context : moving to a Fourier representation, the point is that colored components can 
be separated based on the diversity of their power spectra. 



5.2. SMICA: from Fourier to Wavelets 

SMICA, which was designed to address some of the general problems raised by 
Cosmic Microwave Background data analysis, has already shown significant success 



for CMB spectr a l esti mation in multidetector experiments ( Dcl abrouille et al 



Patanchon et al 



2003: 



20041) . However, SMICA suffers from the non locality of the Fourier 



transform which has undesired effects when dealing with non-stationary components or 
noise, or with incomplete data maps. The latter is a common issue in astrophysical data 
analysis: either the instrument scanned only a fraction of the sky or some regions of the 
sky were masked due to localized strong astrophysical sources of contamination ( compact 
radio-sources or galaxies, strong emitting regions in the galactic plane). A simple way 
to overcome these effects is to move instead to a wavelet representation so as to benefit 



from the localization property of wavelet filters, which leads to wSMICA l|Moudden et al 



20051) . The wSMICA method uses an undecimated d trous algorithm with the cubic box- 



spline IjStarck and Murtae± . 



2002) as the scaling function. This transform has several 



favorable properties for astrophysical data analysis. In particular, it is a shift invariant 
transform, the wavelet coefficient maps on each scale are the same size as the initial 
image, and the wavelet and scaling functions have small compact supports in the initial 
representation. All of these allow missing patches in the data maps to be handled easily. 

Using this wavelet transform algorithm, the multichannel data X is decomposed into 
J detail maps X™ and a smooth approximation map XJ +1 over a dyadic resolution 
scale. Since applying a wavelet transform on 1|15[) does not affect the mixing matrix A, 
the covariance matrix of the observations at scale j is still structured as 



RX(j)=ARS(j)A*+R»(j) 



(16) 



18 



Wavelets, Ridgelets and Curvelets on the Sphere 



where R^j(j) and (j) are the diagonal spectral c ovariance matrices in t he wavelet rep- 



resentation of S and ./V respectively. It was shown (jMoudden et al 



2005) that replacing 



in the SMICA method the covariance matrices derived from the Fourier coefficients by 
the covariance matrices derived from wavelet coefficients leads to much better results 
when the data are incomplete. This is due to the fact that the wavelet filter response on 
scale j is short enough compared to data size and gap widths and most of the samples 
in the filtered signal then remain unaffected by the presence of gaps. Using these sam- 
ples exclusively yields an estimated covariance matrix (j) which is not biased by the 
missing data. 



5.3. SMICA on the Sphere: from spherical harmonics to wavelets on the sphere 
(wSMICA-S) 

E xtending SMICA to deal w ith multichannel data mapped to the sphere is straightfor- 



ward ( Delabrouille et al 



2003). The idea is simply to substitute the spherical harmonics 
transform for the Fourier transform used in the SMICA method. Then, data covari- 
ance matrices are estimated in this representation over intervals in multipole number I. 
However SMICA on spherical maps still suffers from the non locality of the spherical 
harmonics transform whenever the data to be processed is non stationary, incomplete, 
or partly masked. 

Moving to a wavelet representation allows one to overcome these effects : wSMICA 
can be easily implemented for data mapped to the sphere by substituting the UWTS 
described in section[21to the undecimated a trous algorithm used in the previous section. 
In the linear mixture model l|15l) . X now stands for an array of observed spherical maps, 
S is now an array of spherical source maps to be recovered and N is an array of spherical 
noise maps. The mixing matrix A achieves a pixelwise linear mixing of the source maps. 
Applying the above UWTS on both sides of l|15|) does not affect the mixing matrix A 
so that, assuming independent source and noise processes, the covariance matrix of the 
observations at scale j is again structured as in l|16|) . Source separation follows from min- 
imizing the covariance matching criterion (|23[) in this spherical wavelet representation. 
A full description is given in Appendix 2. 
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5.4. Experiments 

As an application of wSMICA on the sphere, we consider here the problem of CMB 
data analysis but in the special case where the use of a galactic mask is a cause of 
non-stationarity which impairs the use of the spherical harmonics transform. 

The simulated CMB, galactic dust and Sunyaev Zel'dovich (SZ) maps used, shown o n 



Dclabrouillc et al 



(2003). 



the left-hand side of Figure ED were obtained as described in 
The problem of instrumental point spread functions is not addressed here, and all maps 
are assumed to have the same resolution. The high level foreground emission from the 
galactic plane region was removed using the Kp2 mask from the WMAP team website 1 . 
These three incomplete maps were mixed using the matrix in Tabled in order to simulate 
observations in the six channels of the Planck high frequency instrument (HFI). 



CMB 


DUST 


SZ 


channel 


1.0 


1.0 


-1.51 


100 GHz 


1.0 


2.20 


-1.05 


143 GHz 


1.0 


7.16 


0.0 


217 GHz 


1.0 


56.96 


2.22 


353 GHz 


1.0 


1.1 x 10 3 


5.56 


545 GHz 


1.0 


1.47 x 10 5 


11.03 


857 GHz 



Table 2. Entries of A, the mixing matrix used in our simulations. 



Gaussian instrumental noise was added in each channel according to model l|15|l . The 
relative noise standard deviations between channels were set according to the nominal 
values of the Planck HFI given in Table |21 

The synthetic observations were decomposed into six scales using the isotropic UWTS 
and wSMICA was used to obtain estimates of the mixing matrix and of the initial source 
templates. The resulting component maps estimated using wSMICA, for nominal noise 
levels, are shown on the right-hand side of Figure [TTT1 where the quality of the reconstruc- 
tion can be visually assessed by comparison to the initial components. The component 
separation was also performed with SMICA based on Fourier statistics computed in the 
1 http:/ /lambda, gsfc.nasa. gov/ product /map/ intensity _mask. cfm 



20 Wavelets, Ridgelets and Curvelets on the Sphere 



100 


143 


217 


353 


545 


857 


channel 


2.65xl0~ 6 


2.33xl0~ 6 


3.44xl0~ 6 


1.05xl0~ 5 


1.07xl0~ 4 


4.84xl0" 3 


noise std 



Table 3. Nominal noise standard deviations in the six channels of the Planck HFI. 

same six dyadic bands imposed by our choice of wavelet transform, and with JADE. In 
Figure ITU the performances of SMICA, wSMICA and JADE are compared in the par- 
ticular case of CMB map estimation, in terms of the relative standard deviation of the 
reconstruction error, MQE, defined by 

std(CMB(0,<p)-axCMB(0,<p)) 
MQE = S td(CMB(0 )¥ O) (17) 

where std stands for empirical standard deviation ( obviously computed outside the 

masked regions), and a is a linear regression coefficient estimated in the least squares 

sense. As expected, since it is meant to be used in a noiseless setting, JADE performs 

well when the noise is very low. However, as the noise level increases, its performance 

degrades quite rapidly compared to the covariance matching methods. Further, these 

results clearly show that using wavelet-based covariance matrices provides a simple and 

efficient way to treat the effects of gaps on the performance of source separation using 

statistics based on the non local Fourier representation. 



Figure 13. Left : zero mean templates for CMB (a = 4.17 x 10 -5 ), galactic dust (cr = 
8.61 x 10~ 6 ) and SZ (a = 3.32 x 10~ 6 ) used in the experiment described in section EH 
The standard deviations given are for the region outside the galactic mask. The maps on 
the right were estimated using wSMICA and scalewise Wiener filtering as is explained 
in Appendix 2. Map reconstruction using Wiener filtering clearly is optimal only in front 
of stationary Gaussian processes. For non Gaussian maps, such as given by the Sunyaev 
Zel'dovich effect, better reconstruction can be expected from non linear methods. The 
different maps are drawn here in different color scales to enhance structures and ease 
visual comparisons. 
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Figure 14. Relative reconstruction error denned by l|17|) of the CMB component map 
using SMICA, wSMICA and JADE as a function of the instrumental noise level in dB 
relative to the nominal values in tabic [21 The zero dB noise level corresponds to the 
expected noise in the future PLANCK mission. 



6. Conclusion 

We have introduced new multiscale decompositions on the sphere, the wavelet trans- 



form, the ridgelet transform and the curvelet transform. It was shown in 



Starck et al 



I 2004) that combining the statistics of wavelet coefficients and curvelet coefficients of flat 
maps leads to a powerful analysis of the non-Gaussianity in the CMB. Using the new 
decompositions proposed here, it is now possible to perform such a study on data mapped 
to the sphere such as from the WMAP or PLANCK experiments. For the denoising ap- 
plication, we have shown that the Combined Filtering Method allows us to significantly 
improve the result compared to a standard hard thresholding in a given transformation. 
Using the isotropic UWTS and statistics in this representation also allows us to prop- 
erly treat incomplete or non-stationary data on the sphere which cannot be done in the 
non-local Fourier representation. This is clearly illustrated by the results obtained with 
wSMICA which achieves component separation in the wavelet domain thus reducing the 
negative impact that gaps have on the performance of source separation using the initial 
SMICA algorithm in the spherical harmonics representation. Further results are available 
at 

\protect\vrule widthOpt\protect\href {http : //j starck . free . f r/mrs .hmtl}{http : //j starck. free . f r/mrs . hmtl} 

as well as information about the IDL code for the described transformations based on 



the Healpix package IjGorski et al 



2002). 
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Appendix 1: The Combined Filtering Method 

In general, suppose that we are given K linear transforms T± , . . . , Tk and let oik be 
the coefficient sequence of an object x after applying the transform Tk, i.e. ak = T^x. 
We will assume that for each transform Tk, a reconstruction rule is available that we will 
denote by , although this is clearly an abuse of notation. T will denote the block 
diagonal matrix with Tk as building blocks and a the amalgamation of the o±k- 

A hard thresholding rule associated with the transform Tk synthesizes an estimate Sk 
via the formula 

s k = T-'Siak) (18) 

where S is a rule that sets to zero all the coordinates of ak whose absolute value falls 
below a given sequence of thresholds (such coordinates are said to be non-significant). 

Given data y of the form y = s + o~z, where s is the image we wish to recover 
and z is standard w hite noise, we propose to solve the following optimization problem 



I Starck et al 



20011) : 

min [[TsU^, subject to s G C, (19) 
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where C is the set of vectors s that obey the linear constraints 

s > 0. 

(20) 

\TS-Ty\ <e; 

here, the second inequality constraint only concerns the set of significant coefficients, i.e. 
those indices [i such that a M = (Ty)^ exceeds (in absolute value) a threshold t^. Given a 
vector of tolerance (e M ), we seek a solution whose coefficients (Ts) M are within e M of the 
noisy empirical a^. Think of as being given by 

so that otp is normally distributed with mean (f,(p^) and variance cr^ = cr 2 ||^||2- I n 
practice, the threshold values range typically between three and four times the noise level 
(Tfj, and in our experiments we will put e M = c M /2. In short, our constraints guarantee 
that the reconstruction will take into account any pattern detected as significant by any 
of the K transforms. 

The Minimization Method 



We propose to solve Ijl9|) using the method of hybrid steepest descent (HSD) ( Yamada, 



2001). HSD consists of building the sequence 

s «+i = P( s ») _ A„ +1 Vj(P( S ")); (21) 

Here, P is the £2 projection operator onto the feasible set C, Vj is the gradient of 
equation 1 191 and (A„)„>i is a sequence obeying (A„)„>i 6 [0, 1] and lim„^ +oc A„ = 0. 
The combined filtering algorithm is: 

1. Initialize L max = 1, the number of iterations Ni, and S\ — L ]^ ax . 

2. Estimate the noise standard deviation tr, and set eu = f-- 

f s) 

3. For k = 1, .., K calculate the transform: a k = TkS. 

4. Set A = L m „, n = 0, and s n to 0. 

5. While A >= do 

- u = s n . 

- For k = 1, .., K do 

— Calculate the transform ak = T^u. 

— For all coefficients ctk,i do 

(s) 

• Calculate the residual rk,i = ct k \ — a^^i 

• if a k s ] is significant and | r fcji |> e ktl then a k j = a k s ] 
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• «fc,z = sgn{a k ,i){\ a k ,i | -A)+. 
- u = T,7 1 ak 

— Threshold negative values in u and s n+1 = u. 

— n = n + 1, A = A — §\, and goto 5. 



Appendix 2: SMICA in the Spherical Wavelet Representation (wSMICA-S) 

A detailed derivation of SMICA and wSMICA can be found in 



(2003) and 



Moudden et al 



Delabrouille et al 



I 2005) . Details are given here showing how the Isotropic 
Undecimated Wavelet Transform on the Sphere described in Section|21above can be used 
to extend SMICA to work in the wavelet domain on spherical data maps. We will refer 
to this extension as wSMICA-S. 



wSMICA-S objective function 

In the linear mixture model (|15f) . X stands for an array of observed spherical maps, S 
is an array of spherical source maps to be recovered and N is an array of spherical noise 
maps. The mixing matrix A achieves a pixelwise linear mixing of the source maps. The 
observations are assumed to have zero mean. Applying the above UWTS on both sides 
of (|15f) does not affect the mixing matrix A so that, assuming independent source and 
noise processes, the covariance matrix of the observations at scale j is still structured as 



RX(j)=ARS(j)At+R5(j) 



(22) 



where R^(j) and R^,(j) are the model diagonal spectral covariance matrices in the 
wavelet representation of S and N respectively. Provided estimates (j) of R* (j) can 
be obtained from the data, our wavelet-based version of SMICA consists in minimizing 
the wSMICA-S criterion: 



J+i 

3=1 



(rX(]),AR s Jj)^ +R%(3) 



(23) 



for some reasonable choice of the weights otj and of the matrix mismatch measure T>, 
with respec t to the full set of parame ters 9 = (A,R^„(i),Ry,(i)\ or a subset thereof. As 



discussed in 



Delabrouille et al 



( 2003) and 



Moudden et al 



1 2005), a good choice for T> is 



V KL (R ll R 2 ) = -(tr(R 1 R- 1 ) - logdet^EJ 1 ) - m) 



(24) 
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which is the Kullback-Lcibler divergence between two m-variate zero-mean Gaussian 
distributions with covariance matrices R± and Ri- With this mismatch measure, the 
wSMICA-S criterion is shown to be related to the likelihood of the data in a Gaussian 
model. We can resort to the EM algorithm to minimize (|23|l . 

In dealing with non stationary data or incomplete data, an attractive feature of 
wavelet filters over the spherical harmonic transform is that they are well localized in 
the initial representation. Provided the wavelet filter response on scale j is short enough 
compared to data size and gap widths, most of the samples in the filtered signal will 
then be unaffected by the presence of gaps. Using exclusively these samples yields an 
estimated covariance matrix (j) which is not biased by the missing data. Writing the 
wavelet decomposition on J scales of X as 

J 

x[$, v ) = xy +l {§,ip) + Y,xy[$M (25) 

and denoting lj the size of the set M.j of wavelet coefficients unaffected by the gaps at 
scale j, the wavelet covariances are simply estimated using 

= T^ XTWt^XfVunV (26) 

The weights in the spectral mismatch (|23|l should be chosen to reflect the variability of 
the estimate of the corresponding covariance matrix. Since wSMICA-S uses wavelet filters 
with only limited ov e rlap, in the case of complete data maps we follow the derivation 



Delabrouille et al 



1 2003) and take aj to be proportional to the number of spherical 
harmonic modes in the spectral domain covered at scale j. In the case of data with gaps, 
we must further take into account that only a fraction f3j of the wavelet coefficients are 
unaffected so that the aj should be modified in the same ratio. 



Source map estimation 

As a result of applying wSMICA-S, power densities in each scale are estimated for 
the sources and detector noise along with the estimated mixing matrix. These arc used in 
reconstructing the source maps via Wiener filtering in each scale: a coefficient ip) 
is used to reconstruct the maps according to 

§J?(tf )¥ >) = {A^R^Ur^A + RiU)- 1 )- 1 x ftR»V)- l XV>(0,<p) (27) 
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In the limiting case where noise is small compared to signal components, this filter reduces 
to 

§?(0, <p) = ($RZ (j)" 1 !)" 1 !^ (j^XfW, <p) (28) 

Clearly, the above Wiener filter is optimal only for stationary Gaussian processes. For 
non Gaussian maps, such as given by the Sunyaev Zel'dovich effect, better reconstruction 
can be expected from non linear methods. 



